start: Data Access and Data Versioning to mention Model in titles (#2096) #2214
- Changes title of "Data Access" to "Data and Model Access"
- Changes title of "Data Versioning" to "Data and Model Versioning"
- Renames path of Data Access and Data Versioning to `data-and-model-access.md` and `data-and-model-versioning.md` respectively.
- Adds redirects
  - `/doc/start/data-access` -> `/doc/start/data-and-model-access`
  - `/doc/start/data-versioning` -> `/doc/start/data-and-model-versioning`
- Replaces links in `/doc/start` with the new links.
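For reviewers who want to sanity-check the two redirects listed above, here is a minimal sketch of the intended old-to-new path mapping. The `REDIRECTS` table and the `resolve` helper are hypothetical names used only for illustration; they are not the mechanism the site actually uses to store or apply redirects.

```python
# Hypothetical redirect table mirroring the two renames in this PR.
# The real site may store and apply redirects differently.
REDIRECTS = {
    "/doc/start/data-access": "/doc/start/data-and-model-access",
    "/doc/start/data-versioning": "/doc/start/data-and-model-versioning",
}


def resolve(path: str) -> str:
    """Return the new location for a renamed page, or the path unchanged."""
    # Treat "/doc/start/data-access/" and "/doc/start/data-access" alike.
    normalized = path.rstrip("/") or "/"
    return REDIRECTS.get(normalized, normalized)
```

Calling `resolve` on an old path returns the renamed one, while any path not in the table comes back unchanged (apart from a stripped trailing slash).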
I can replace all links to |
@iesahin looks good to me! let me know when it's ready to be merged, I see that you still have a TODO?
We can also probably add |
Getting there!
Co-authored-by: Jorge Orpinel <[email protected]>
…rg into iesahin/issue2096-take-2
…sahin/issue2096-take-2
Co-authored-by: Jorge Orpinel <[email protected]>
…ke-2 Restyle start: Data Access and Data Versioning to mention Model in titles (#2096)
## Model versioning

DVC helps you to handle model files as well. Models in a project usually change
more frequently than data files and they need to be kept in sync with changes in
other elements of a project. Model files are no different than data files when
it comes to tracking their versions. DVC also provides means to track minor
changes in model files without fully checking in to Git. In later sections of
this series, you'll see how DVC enables to track changes to synchronize multiple
model and data files.
Continuing #2214 (review)
I'm still not convinced we need this new section. We already say "data and models" in every section (except in Retrieving — let's fix that though), so if the gist here is that models are also tracked as any file normally, I think that's already implied in every other section.
Also, it can probably be summarized a bit (see feedback below) and then it's too short for a whole section anyway (could be moved to right before Storing and sharing if anything).
- "usually change more frequently than data files" contradicts "are no different than data files" (at first sight)
- "provides means to track minor changes in model files" - vague, what do we mean? run-cache, parameters etc. are not specifically for models.
- The last sentence seems unnecessary.
Okay, now that we've learned how to _track_ data and models with DVC and how to
version them with Git, next question is how can we _use_ these artifacts outside
of the project? How do I download a model to deploy it? How do I download a
specific version of a model? How do I reuse datasets across different projects?

We've learned how to _track_ data files in DVC and how to commit their versions
to Git. Machine learning models are typically large files written and read by
programs. DVC can track and version model files similar to data files. The next
questions are: How can we _use_ these artifacts outside of the project? How do I
download a model to deploy it? How do I download a specific version of a model?
How do I reuse datasets across different projects?
This change seems to break the semantics of the paragraph. TBH I don't think it's necessary, we already state "data and models".
I do like the correction to the first sentence, "... , and how to commit their versions to Git" but the next 2 sentences are out of context here.
I left 2 still-pending questions above ☝️
In order to merge this (the title and link changes should be useful, thanks @iesahin), I'm basically rolling back those 2 things (committing the suggestions below) but feel free to continue discussing/explaining and open another PR if needed.
@shcheklein do we still want to merge this? If so I'll solve the conflicts and do so. Thanks UPDATE: I guess so, since #2096 is labeled |
p.s. everything here is link updates except content/docs/start/data-and-model-versioning.md and a bit more in data-and-model-access.md
UPDATE: Jump to #2214 (review)
Fixes #2096
TODO: replace links to `data-access` and `data-versioning` with the new paths. (DONE in bb84a99 9ef97c6 2593bb7 9ed0867 3d7d61d)